Ektron CMS400.Net Reference


Query Language

 

Note: Following text collected from Microsoft Web site. © 2006 by Microsoft Corporation. All rights reserved.

To search for any word or phrase on a Web site, enter the word or phrase into the Search field and click the button to begin the search.

Rules for Formulating Queries

When a search is executed, it returns a list of Web pages that contain the word or phrase that a user entered, regardless of where it appears in text.

Follow these rules when formulating queries.

Multiple words are treated as individual search terms. So, the term calendar server returns pages that have both words.

To find pages that have calendar and server in that exact order, use quotes. “calendar server” returns pages that include both terms in that exact order with no intervening words.

Queries are case-insensitive. You can type a query in upper or lower case.

The search ignores words listed in the noise files. Ektron CMS400.NET’s noise files exclude every single letter of the alphabet and many common words, such as about, after, all, and also. Avoid entering such words in the Search Text field.

The list of noise words (noise.enu) is installed in your siteroot/Workarea and Windows/System32 folders. You can open a noise file with a plain-text editor, such as Notepad, to view the noise words. You can also edit the files. For example, you can remove words that users should be able to search on.

To make a word in the noise files searchable, remove it from both files, then restart both catalogs.

Exceptions:

- In the Ektron CMS400.NET Workarea, the Advanced Search disregards the noise file.

- You cannot make the indexing service operators (and, or, but) searchable.

Words in the noise file are treated as placeholders in phrase and proximity queries. For example, if you search for “Word for Windows”, the results could include both “Word for Windows” and “Word and Windows”, because “for” is a noise word.

Punctuation marks, such as period (.) and comma (,), are ignored by a search.

To use special characters, such as &, |, ^, #, @, $, (, and ), in a query, enclose the query in quotation marks (“).

To search for a word or phrase containing quotation marks, surround the entire phrase with quotation marks and double the quotation marks around the word to be surrounded with quotes. For example, “World-Wide Web or ““Web””” searches for World-Wide Web or “Web”.

Use Boolean operators (AND, OR) and the proximity operator (NEAR) to specify additional search criteria. See Also: Boolean and Proximity Operators

Use the wildcard character (*) to find words with a given prefix. For example, the query esc* finds Web pages with “ESC,” “escape,” and so on. See Also: Wildcards

You can specify free-text queries without regard to query syntax. See Also: Free-Text Queries

Vector space queries can be specified. See Also: Vector Space Queries

You can search on ActiveX™ (OLE) and file attribute property values. See Also: Property Value Queries
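The noise-word and quotation-mark rules above can be sketched in a few lines of Python. This is an illustration only, not part of Ektron CMS400.NET: the function names are hypothetical, the noise set holds just the four sample words named above (the real list is in noise.enu), and straight ASCII quotation marks are assumed.

```python
# Hypothetical helpers for illustration; not an Ektron CMS400.NET API.

# Only the sample noise words named in the text; the real list is in noise.enu.
SAMPLE_NOISE_WORDS = {"about", "after", "all", "also"}


def is_noise(word: str) -> bool:
    """Return True for single letters and for words in the sample noise set."""
    lowered = word.lower()
    return (len(lowered) == 1 and lowered.isalpha()) or lowered in SAMPLE_NOISE_WORDS


def strip_noise(query: str) -> str:
    """Drop noise words from an unquoted, space-separated query."""
    return " ".join(w for w in query.split() if not is_noise(w))


def quote_phrase(phrase: str) -> str:
    """Wrap a phrase in quotation marks, doubling any embedded quotation marks."""
    return '"' + phrase.replace('"', '""') + '"'


print(strip_noise("all about calendar server"))  # calendar server
print(quote_phrase('World-Wide Web or "Web"'))   # "World-Wide Web or ""Web"""
```

The second call reproduces the doubled-quote example given in the rules.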

Boolean and Proximity Operators

Use Boolean and proximity operators to create a more precise query.

To Search for: Both terms on a page
Example: healthcare insurance
Results: Pages with the words “healthcare” and “insurance”

To Search for: Either term on a page
Example: kidney or renal (or, using the symbol, kidney | renal)
Results: Pages with “kidney” or “renal”

To Search for: All pages that match a property value
Example: @CMSsize > 1000
Results: Pages larger than 1000 kilobytes

To Search for: Both terms on a page, close together
Example: treatment near immunoglobulin (or, using the symbol, treatment ~ immunoglobulin)
Results: Pages with the word “treatment” near the word “immunoglobulin”. See Also: the NEAR operator tips below.

Tips

To nest expressions within a query, add parentheses. Expressions within parentheses are evaluated before the rest of the query.

Use double quotes (“) to ignore a Boolean or NEAR operator keyword. For example, “Abbott and Costello” finds pages with the entire phrase, not pages that match the Boolean expression. In addition to being an operator, the word “and” is a noise word in English.

The NEAR operator is like the AND operator because it finds pages that include both search words.

However, the rank assigned by NEAR depends on the proximity of the search words. A page with the searched-for words closer together has a higher rank than a page where they are farther apart. If the search words are more than 50 words apart, the page is assigned a rank of zero.

 

Note: The NEAR operator can be applied only to words or phrases.
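To illustrate how proximity can drive rank, here is a minimal Python sketch. Only the 50-word cutoff comes from the text above. The linear falloff, the function names, and the simple whitespace word splitting are assumptions made for the example, and the search engine's real ranking formula is not documented here.

```python
from typing import List, Optional

MAX_NEAR_DISTANCE = 50  # from the text: more than 50 words apart ranks zero


def min_word_distance(words: List[str], first: str, second: str) -> Optional[int]:
    """Smallest gap, in word positions, between any occurrences of the two terms."""
    first_pos = [i for i, w in enumerate(words) if w.lower() == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w.lower() == second.lower()]
    if not first_pos or not second_pos:
        return None  # NEAR, like AND, needs both terms on the page
    return min(abs(a - b) for a in first_pos for b in second_pos)


def near_rank(text: str, first: str, second: str) -> float:
    """Illustrative rank: closer terms score higher, too-distant terms score 0."""
    distance = min_word_distance(text.split(), first, second)
    if distance is None or distance > MAX_NEAR_DISTANCE:
        return 0.0
    # Hypothetical linear falloff, chosen only for this sketch.
    return 1.0 - (max(distance, 1) - 1) / MAX_NEAR_DISTANCE


close = "treatment with immunoglobulin"
far = "treatment " + "filler " * 60 + "immunoglobulin"
print(near_rank(close, "treatment", "immunoglobulin") > 0.0)  # True
print(near_rank(far, "treatment", "immunoglobulin"))          # 0.0
```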

The AND operator has a higher precedence than OR. For example, the first three queries are equivalent, but the fourth is not:

- a AND b OR c

- c OR a AND b

- c OR (a AND b)

- (c OR a) AND b
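The precedence rule can be checked with a small truth table. The Python sketch below is only an illustration: a, b, and c stand for whether a page contains each term, and it relies on Python's own `and` binding tighter than `or`, which mirrors the rule described above.

```python
from itertools import product


def queries(a: bool, b: bool, c: bool):
    """Evaluate the four example queries for one page.

    a, b, and c say whether the page contains each term.
    """
    return (
        a and b or c,    # a AND b OR c
        c or a and b,    # c OR a AND b
        c or (a and b),  # c OR (a AND b)
        (c or a) and b,  # (c OR a) AND b
    )


for a, b, c in product([False, True], repeat=3):
    q1, q2, q3, q4 = queries(a, b, c)
    assert q1 == q2 == q3  # the first three always agree
    if q4 != q1:
        print(f"a={a}, b={b}, c={c}: first three -> {q1}, fourth -> {q4}")
```

The fourth query differs only for pages that contain c but not b: the first three queries match such a page, while the fourth does not.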

Localized Symbols and Keywords

The symbols (&, |, !, ~) and the English keywords AND, OR, and NEAR work the same in all languages supported by Ektron CMS400.NET. Localized keywords are also available when the browser locale is set to one of the following languages.

Language: German
Keywords: UND, ODER, NICHT, NAH

Language: French
Keywords: ET, OU, SANS, PRES

Language: Spanish
Keywords: Y, O, NO, CERCA

Language: Dutch
Keywords: EN, OF, NIET, NABIJ

Language: Swedish
Keywords: OCH, ELLER, INTE, NÄRA

Language: Italian
Keywords: E, O, NO, VICINO

Wildcards

Wildcard operators find pages that contain words similar to a given word.

To Search for: Words with the same beginning letters
Example: pharm*
Results: Pages with words that have the prefix pharm, such as pharmaceutical, pharmacist, and pharmacology.

To Search for: Words based on any form of a verb
Example: fly**
Results: Pages with any form of the verb. For example, if you enter fly**, the search returns pages that contain flying, flown, flies, and flew.
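As a rough model of the single-asterisk wildcard, the Python sketch below treats the pattern as a case-insensitive prefix test. The function name is hypothetical. The ** form is deliberately not modeled, because matching verb forms requires linguistic stemming rather than string comparison.

```python
def matches_prefix(word: str, pattern: str) -> bool:
    """Return True if `word` matches a trailing-asterisk pattern such as 'pharm*'.

    Matching is case-insensitive, as queries are. Only the single trailing
    asterisk form is modeled here.
    """
    if not pattern.endswith("*") or pattern.endswith("**"):
        raise ValueError("expected a pattern ending in a single '*'")
    prefix = pattern[:-1]
    return word.lower().startswith(prefix.lower())


words = ["pharmaceutical", "Pharmacist", "pharmacology", "farm"]
print([w for w in words if matches_prefix(w, "pharm*")])
# ['pharmaceutical', 'Pharmacist', 'pharmacology']
```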

Free-Text Queries

A free-text query finds pages that match the meaning, not the exact wording, of submitted words and phrases. Begin free-text queries with $contents.

You cannot use Boolean, proximity, or wildcard operators in a free-text query.

To Search for: Files that match free-text
Example: $contents how do I print in Microsoft Excel?
Results: Pages that mention printing and Microsoft Excel

Vector Space Queries

The search supports vector space queries, which return pages that include a list of words and phrases. Each page is ranked according to how well it matches the query.

To Search for: Pages that contain specific words
Example: light, bulb
Results: Files with words that best match the search words

To Search for: Pages that contain weighted prefixes, words, and phrases
Example: invent*, light[50], bulb[10], “light bulb”[400]
Results: Files that contain words prefixed by “invent,” the words “light” and “bulb,” and the phrase “light bulb” (the terms are weighted)

Tips

Separate terms in a vector query with commas (,).

You can weight terms in vector queries by using the [weight] syntax (see the example above).

Pages found by vector queries do not necessarily match all words submitted in the query.

Vector queries work best when results are sorted by rank.
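The comma and [weight] syntax can be assembled programmatically. The Python sketch below is a hypothetical helper, not an Ektron CMS400.NET API. It only builds the query string and uses straight ASCII quotation marks for the phrase.

```python
from typing import Iterable, Optional, Tuple


def vector_query(terms: Iterable[Tuple[str, Optional[int]]]) -> str:
    """Join (term, weight) pairs into a comma-separated vector query.

    A weight of None leaves the term unweighted.
    """
    parts = []
    for term, weight in terms:
        parts.append(term if weight is None else f"{term}[{weight}]")
    return ", ".join(parts)


print(vector_query([
    ("invent*", None),
    ("light", 50),
    ("bulb", 10),
    ('"light bulb"', 400),
]))
# invent*, light[50], bulb[10], "light bulb"[400]
```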

Property Value Queries

Use a property value query to find files whose property values match a given criterion. Properties you can query include file information (like file name and size), and ActiveX properties, including the document summary stored in files created by ActiveX-aware applications.

There are two types of property queries.

Relational property queries consist of an “at” character (@), a property name, a relational operator, and a property value. For example, to find all files larger than one million kilobytes, use @CMSsize > 1000000. See Also: Relational Operators

Regular expression property queries consist of a number sign (#), a property name, and a regular expression for the property value. For example, to find all .avi files, use #filename *.avi. See Also: Regular Expressions

Regular expressions do not match the contents (#contents) and all (#all) properties.

In regular expression property queries, you can only use properties that are retrievable at query time. Properties that are not retrievable include HTML META properties not stored in the property cache.
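The two query forms differ only in their prefix character and in whether an operator appears. The Python sketch below uses hypothetical helper names to build each form as a string, and rejects the combinations this section says do not work. It does not contact the search engine.

```python
# Relational operators listed in this section.
RELATIONAL_OPERATORS = {"<", "<=", "=", "!=", ">=", ">"}


def relational_query(prop: str, operator: str, value: str) -> str:
    """Build an @-prefixed relational property query."""
    if operator not in RELATIONAL_OPERATORS:
        raise ValueError(f"unsupported relational operator: {operator!r}")
    if prop.lower() == "contents":
        raise ValueError("the contents property does not support relational operators")
    return f"@{prop} {operator} {value}"


def regex_query(prop: str, pattern: str) -> str:
    """Build a #-prefixed regular expression property query."""
    if prop.lower() in ("contents", "all"):
        raise ValueError("regular expressions do not match the contents and all properties")
    return f"#{prop} {pattern}"


print(relational_query("CMSsize", ">", "1000000"))  # @CMSsize > 1000000
print(regex_query("filename", "*.avi"))             # #filename *.avi
```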

Property Names

Property names are preceded by the “at” sign (@) for relational queries, and the pound sign (#) for regular expression queries.

If no property name is specified, @contents is assumed.

Properties available for all files are listed below.

Property Name: All
Description: Matches words, phrases, and any property

Property Name: Contents
Description: Words and phrases in the file
Note: The contents property does not support relational operators. If a relational operator is specified, no results are found. For example, @contents Ektron finds documents containing Ektron, but @contents=Ektron finds none.

Property Name: Filename
Description: Name of the file

Property Name: CMSsize
Description: File size

Property Name: Write
Description: Date file was created or last modified (whichever is later)

You can also use ActiveX property values in queries. You can search for files created by most ActiveX-aware applications by querying for the following properties.

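Property Name: DocTitle
Description: Title of the document

Property Name: DocSubject
Description: Subject of the document

Property Name: DocAuthor
Description: The document’s author

Property Name: DocKeywords
Description: Keywords for the document

Property Name: DocComments
Description: Comments about the document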

Relational Operators

Use relational operators to create relational property queries.

To Search for: Property values in relation to a fixed value
Examples: @CMSsize < 100; @CMSsize <= 100; @CMSsize = 100; @CMSsize != 100; @CMSsize >= 100; @CMSsize > 100
Results: Files whose size matches the query

To Search for: Property values with all of a set of bits on
Example: @attrib ^a 0x820
Results: Compressed files with the archive bit on

To Search for: Property values with some of a set of bits on
Example: @attrib ^s 0x20
Results: Files with the archive bit on
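The difference between ^a (all bits) and ^s (some bits) is ordinary bit masking. The Python sketch below shows the two tests. The function names are illustrative, and the constants are the standard Windows file attribute values, which is why 0x820 in the example above means compressed plus archive.

```python
FILE_ATTRIBUTE_ARCHIVE = 0x20      # Windows archive bit
FILE_ATTRIBUTE_COMPRESSED = 0x800  # Windows compressed bit


def all_bits_on(value: int, mask: int) -> bool:
    """Semantics of ^a: every bit in the mask is set in the value."""
    return (value & mask) == mask


def some_bits_on(value: int, mask: int) -> bool:
    """Semantics of ^s: at least one bit in the mask is set in the value."""
    return (value & mask) != 0


mask = FILE_ATTRIBUTE_COMPRESSED | FILE_ATTRIBUTE_ARCHIVE  # 0x820
archived_only = FILE_ATTRIBUTE_ARCHIVE

print(all_bits_on(archived_only, mask))   # False: the compressed bit is off
print(some_bits_on(archived_only, mask))  # True: the archive bit is on
```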

Property Values

To Search for: A specific value
Example: @DocAuthor = “Bill Bailey”
Results: Files authored by Bill Bailey

To Search for: Values beginning with a prefix
Example: #DocAuthor George*
Results: Files whose author property begins with George

To Search for: Files with a given extension
Example: #filename *.gif
Results: Files with a .gif extension
Note: Because Ektron CMS400.NET stores all content in .txt files, you cannot use this syntax to find files with a .txt extension.

To Search for: Files modified after a certain date
Example: @write > 2006/02/14
Results: Files modified after February 14, 2006
Note: You cannot use the equal operator (=) with @write. Only the greater than (>) and less than (<) operators work.

To Search for: Vectors matching a vector
Example: @vectorprop = { 10, 15, 20 }
Results: ActiveX documents with a vectorprop value of {10, 15, 20}

To Search for: Vectors where each value matches a criterion
Example: @vectorprop >^a 15
Results: ActiveX documents with a vectorprop value in which all values in the vector are greater than 15

To Search for: Vectors where at least one value matches a criterion
Example: @vectorprop =^s 15
Results: ActiveX documents with a vectorprop value in which at least one value is 15
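For vector properties, ^a and ^s behave like "all" and "any" applied to a comparison. The Python sketch below models that reading with hypothetical helper names. How the search engine treats an empty vector is not covered here.

```python
import operator
from typing import Callable, Sequence


def all_match(values: Sequence[int], compare: Callable[[int, int], bool], target: int) -> bool:
    """Semantics of <operator>^a: every value in the vector satisfies the comparison."""
    return all(compare(v, target) for v in values)


def some_match(values: Sequence[int], compare: Callable[[int, int], bool], target: int) -> bool:
    """Semantics of <operator>^s: at least one value satisfies the comparison."""
    return any(compare(v, target) for v in values)


vectorprop = [10, 15, 20]
print(all_match(vectorprop, operator.gt, 15))   # False: 10 and 15 are not greater than 15
print(some_match(vectorprop, operator.eq, 15))  # True: one value equals 15
```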

Tips for Using Property Queries

Use the pound (#) character before the property name when using a regular expression in a property value.

Use the “at” (@) character otherwise. The equal (=) relational operator is assumed for regular-expression queries.

File name (#filename) is the only property that efficiently supports regular expressions with wildcards to the left of text.

Dates use the format yyyy/mm/dd.

You can omit the first two characters of the year. If you do, a value of 29 or less is interpreted as a year in the 2000s, and a value of 30 or greater is interpreted as a year in the 1900s. All dates are in Greenwich Mean Time (GMT).

Currency values use the format x.y, where x is the whole value amount and y is the fractional amount. There is no assumption about units.

Boolean values are (t) or (true) for TRUE and (f) or (false) for FALSE.

Vectors (VT_VECTOR) are expressed as an opening brace ({), a comma-separated list of values, and a closing brace (}).

Single-value expressions that are compared against vectors are expressed as a relational operator, then a (^a) for all of or a (^s) for some of. See Also: Relational Operators

Numeric values can be expressed in decimal or hexadecimal (preceded by 0x).
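The date tip above can be made concrete with a short Python sketch. It assumes the two-digit rule means that 00 through 29 map to 2000 through 2029 and 30 through 99 map to 1930 through 1999. The function name is hypothetical.

```python
from datetime import date


def parse_query_date(text: str) -> date:
    """Parse yyyy/mm/dd or yy/mm/dd using the two-digit-year rule in the tips."""
    year_text, month_text, day_text = text.split("/")
    year = int(year_text)
    if len(year_text) == 2:
        # Assumed reading: 29 or less means 20xx, 30 or greater means 19xx.
        year += 2000 if year <= 29 else 1900
    return date(year, int(month_text), int(day_text))


print(parse_query_date("2006/02/14"))  # 2006-02-14
print(parse_query_date("06/02/14"))    # 2006-02-14
print(parse_query_date("30/01/01"))    # 1930-01-01
```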

Regular Expressions

Regular expressions in property queries are defined as follows.

Any character except asterisk (*), period (.), question mark (?), and vertical bar (|) defaults to matching itself.

A regular expression can be enclosed in matching quotes (“). It must be enclosed in quotes if it contains a space or a closing parenthesis, ).

The characters *, ., and ? behave as in Windows: * matches any number of characters, . matches a period or the end of the string, and ? matches any one character.

The character | is an escape character. After |, the following characters have special meaning:

- ( opens a group. Must be followed by a matching ).

- ) closes a group. Must be preceded by a matching (.

- [ opens a character class. Must be followed by a matching (un-escaped) ].

- { opens a counted match. Must be followed by a matching }.

- } closes a counted match. Must be preceded by a matching {.

- , separates OR clauses.

- * matches zero or more occurrences of the preceding expression.

- ? matches zero or one occurrences of the preceding expression.

- + matches one or more occurrences of the preceding expression.

Anything else, including |, matches itself.

Between square brackets ([]), the following characters have special meaning.

- ^ matches everything but following classes. Must be the first character.

- ] matches ]. May only be preceded by ^. Otherwise, it closes the class.

- - range operator. Preceded and followed by normal characters.

Anything else matches itself (or begins or ends a range at itself).

Between curly braces ({}), the following syntax applies.

- |{m|} matches exactly m occurrences of the preceding expression. (0 < m < 256).

- |{m,|} matches at least m occurrences of the preceding expression. (1 < m < 256).

- |{m,n|} matches between m and n occurrences of the preceding expression, inclusive. (0 < m < 256, 0 < n < 256).

To match *, ., and ?, enclose them in brackets (for example, |[*]sample matches “*sample”).
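The bracket rule for literal wildcard characters can be applied mechanically. The Python sketch below uses a hypothetical helper that wraps each *, ., or ? in an escaped character class. It handles only those three characters and leaves everything else, including the | escape character, untouched.

```python
WILDCARDS = "*.?"


def escape_literal(text: str) -> str:
    """Wrap each *, ., or ? in |[ ] so that it matches itself."""
    return "".join(f"|[{ch}]" if ch in WILDCARDS else ch for ch in text)


print(escape_literal("*sample"))  # |[*]sample
```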

Query Examples

Example: @CMSsize > 10000
Results: Pages larger than 10,000 kilobytes (about 10 MB)

Example: @write > 2003/05/12
Results: Pages modified after the date
Note: You cannot use the equal operator (=) with @write. Only the greater than (>) and less than (<) operators work.

Example: pear tree
Results: Pages with the phrase “pear tree”

Example: “pear tree”
Results: Same as above

Example: @contents pear tree
Results: Same as above

Example: Ektron and @CMSsize > 10000
Results: Pages with the word “Ektron” that are larger than 10,000 kilobytes

Example: “Ektron and @CMSsize > 10000”
Results: Pages with the phrase specified (not the same as above)

Example: #filename *.avi
Results: Video files (the # prefix is used because the query contains a regular expression)

Example: @attrib ^s 32
Results: Pages with the archive attribute bit on

Example: @docauthor = “John Stanton”
Results: Pages with the given author

Example: $contents why is the sky blue?
Results: Pages that match the query



Ektron CMS400.NET Reference Version 8.02 SP1 Rev 1

Ektron Documentation,© 2011 Ektron, Inc.